Enter the name of the tissue you want to analyze.
tissue_of_interest = "Liver"
library(here)
source(here("00_data_ingest", "02_tissue_analysis_rmd", "boilerplate.R"))
tiss = load_tissue_facs(tissue_of_interest)
Performing log-normalization
[1] "Scaling data matrix"
Calculating gene means
Calculating gene variance to mean ratios
We can visualize top genes in each principal component.
We then project onto just the top principal components. This keeps the major directions of variation in the data and, ideally, suppresses noise. A decent rule of thumb is to pick the elbow in the plot below.
PCElbowPlot(object = tiss)
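The elbow rule of thumb can also be approximated numerically. The sketch below is a minimal base-R illustration on made-up numbers, not part of the original analysis. `pick_elbow` is a hypothetical helper: it takes the per-component standard deviations (the quantity an elbow plot shows on its y-axis) and returns the component that lies farthest below the straight line joining the first and last points. This is only one of several possible heuristics, and looking at the plot remains the final check.

```r
# Hypothetical helper: locate the "elbow" in a decreasing curve of
# per-component standard deviations.
pick_elbow <- function(sdev) {
  n <- length(sdev)
  stopifnot(n >= 3)
  x <- seq_len(n)
  # Straight line joining the first and last points of the curve
  chord <- sdev[1] + (sdev[n] - sdev[1]) * (x - 1) / (n - 1)
  # The elbow is where the curve dips farthest below that line
  which.max(chord - sdev)
}

# Toy standard deviations (made up for illustration)
toy_sdev <- c(10, 8, 6, 1, 0.9, 0.8, 0.7)
elbow <- pick_elbow(toy_sdev)
```

On these toy values the curve flattens from the fourth component onward, so one would keep roughly the first three or four components.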
Choose the number of principal components to use.
n.pcs = 11
# Set resolution
res.used <- 1
tiss <- FindClusters(object = tiss, reduction.type = "pca", dims.use = 1:n.pcs,
resolution = res.used, print.output = 0, save.SNN = TRUE)
We use tSNE solely to visualize the data.
tiss <- RunTSNE(object = tiss, dims.use = 1:n.pcs, seed.use = 10, perplexity = 30)
TSNEPlot(object = tiss, do.label = TRUE, pt.size = 1.2, label.size = 4)
Check expression of genes useful for indicating cell type.
genes_hep = c('Alb', 'Ttr', 'Apoa1', 'Serpina1c') # hepatocytes
genes_endo = c('Pecam1', 'Nrp1', 'Kdr', 'Oit3') # endothelial cells
genes_kuppfer = c('Emr1', 'Clec4f', 'Cd68', 'Irf7') # Kupffer cells
genes_nk = c('Zap70', 'Il2rb', 'Nkg7', 'Cxcr6') # natural killer cells
genes_b = c('Cd79a', 'Cd79b', 'Cd74', 'Cd19') # B cells
genes_all = c(genes_hep, genes_endo, genes_kuppfer, genes_nk, genes_b)
In the tSNE plots below, the intensity of each point represents the log-normalized gene expression \(N_{ij}\).
Dotplots show, for each cluster and gene, the fraction of cells with at least one read for the gene (circle size) and the average scaled expression for that gene among the cells expressing it (circle color).
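The two dot plot quantities can be computed by hand. The following base-R sketch does so for a single gene on toy data; all values and variable names are made up for illustration, and this is not the plotting function's own code.

```r
# Toy data for one gene across six cells (values made up for illustration)
counts  <- c(0, 3, 5, 0, 0, 1)                  # raw read counts
scaled  <- c(-0.5, 0.8, 1.2, -0.5, -0.5, 0.1)   # scaled expression
cluster <- factor(c("a", "a", "a", "b", "b", "b"))

# Cell indices grouped by cluster
idx_by_cluster <- split(seq_along(counts), cluster)

# Circle size: fraction of cells in the cluster with at least one read
pct_exp <- vapply(idx_by_cluster,
                  function(i) mean(counts[i] > 0),
                  numeric(1))

# Circle color: mean scaled expression among the expressing cells only
avg_exp <- vapply(idx_by_cluster, function(i) {
  pos <- i[counts[i] > 0]
  if (length(pos) == 0) NA_real_ else mean(scaled[pos])
}, numeric(1))
```

Restricting the average to expressing cells keeps the two quantities independent: a gene detected in few cells of a cluster can still show a high average among the cells that express it.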
The low but nonzero levels of albumin present in all clusters are consistent with a small amount of leakage, either through physical contamination or index hopping. Nevertheless, the absolute levels of expression confirm a sharp difference between the hepatocyte clusters and the others.
To confirm the identity of a cluster, you can inspect the genes differentially expressed in that cluster compared to the others.
clust.markers7 <- FindMarkers(object = tiss, ident.1 = 7,
only.pos = TRUE, min.pct = 0.25, thresh.use = 0.25)
|++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed = 01m 06s
The top markers for cluster 7 include histocompatibility markers H2-*, consistent with the expression of other B-cell markers seen above.
head(clust.markers7)
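To make the cluster-versus-rest comparison concrete, here is a minimal base-R sketch for a single gene on made-up values. It is a generic illustration, not a description of what `FindMarkers` does internally: the choice of a Wilcoxon rank-sum test and the variable names are assumptions, and the 0.25 cutoff simply mirrors the `min.pct` value used above.

```r
# Toy log-normalized expression of one gene (values made up for illustration)
in_cluster <- c(5, 6, 7, 8, 9, 10)        # cells in the cluster of interest
rest       <- c(0, 0, 1, 0, 2, 1, 0, 0)   # cells in all other clusters

# Fraction of cells in each group that express the gene at all
pct_in   <- mean(in_cluster > 0)
pct_rest <- mean(rest > 0)

# Keep the gene only if it is detected in at least 25% of cells in either group
passes_min_pct <- max(pct_in, pct_rest) >= 0.25

# Difference in mean expression (positive means higher in the cluster)
mean_diff <- mean(in_cluster) - mean(rest)

# Rank-based test of cluster versus rest; exact = FALSE because of ties
test <- wilcox.test(in_cluster, rest, exact = FALSE)

# Call the gene a positive marker if it passes all three checks
is_marker <- passes_min_pct && mean_diff > 0 && test$p.value < 0.05
```

In a real analysis the p-values would also need correction for the number of genes tested.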
Using the markers, we can confidently label the clusters. We provide both a free annotation (where any name can be used) and a cell ontology class. The latter uses a controlled vocabulary for easy comparison between studies and different levels of the taxonomy.
tiss <- StashIdent(object = tiss, save.name = "cluster.ids")
cluster.ids <- c(0, 1, 2, 3, 4, 5, 6, 7, 8)
free_annotation <- c(
"endothelial cell",
"hepatocyte",
"hepatocyte",
"hepatocyte",
"hepatocyte",
"kuppfer",
"hepatocyte",
"B cell",
"NK/NKT cells")
cell_ontology_class <- c(
"endothelial cell of hepatic sinusoid",
"hepatocyte",
"hepatocyte",
"hepatocyte",
"hepatocyte",
"Kupffer cell",
"hepatocyte",
"B cell",
"natural killer cell")
tiss = stash_annotations(tiss, cluster.ids, free_annotation, cell_ontology_class)
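`stash_annotations` is a helper defined in `boilerplate.R`, and its implementation is not shown here. Conceptually, annotating cells comes down to looking up each cell's cluster number in `cluster.ids` and taking the label at the same position. The base-R sketch below shows that lookup on a shortened, hypothetical example; the `toy_` names and the per-cell assignments are made up, and the real helper may work differently.

```r
# Cluster numbers and labels, abbreviated to three clusters for illustration
toy_cluster_ids     <- c(0, 1, 2)
toy_free_annotation <- c("endothelial cell", "hepatocyte", "hepatocyte")

# The two vectors must line up one-to-one
stopifnot(length(toy_cluster_ids) == length(toy_free_annotation))

# Hypothetical per-cell cluster assignments
cell_cluster <- c(1, 0, 2, 2, 0)

# Look up each cell's label by the position of its cluster in toy_cluster_ids
cell_annotation <- toy_free_annotation[match(cell_cluster, toy_cluster_ids)]
```

Because the lookup is positional, the order of the label vectors must match the order of `cluster.ids` exactly; a length check like the one above catches the most common mistake.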
Color by metadata, such as plate barcode or mouse ID, to check for batch effects. Here, coloring by mouse ID shows that most clusters are segregated by sex.
TSNEPlot(object = tiss, do.return = TRUE, group.by = "mouse.id")
Nevertheless, every cluster contains cells from multiple mice.
table(FetchData(tiss, c('mouse.id','ident')) %>% droplevels())
          ident
mouse.id     0   1   2   3   4   5   6   7   8
  3_11_M   160   0  59  29  51  47   0  31  32
  3_56_F     0  46   0  11   0   0  25   0   0
  3_57_F     0  46   0   5   0   0  28   0   0
  3_9_M     22   1  32  37  20  14   1  10   7
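Both observations, that every cluster contains cells from several mice and that clusters separate by sex, can be checked directly from the table. The base-R sketch below re-enters the counts shown above; reading sex from the `_M` / `_F` suffix of the mouse ID is an inference from the IDs, not something the notebook states.

```r
# Cell counts per mouse (rows) and cluster (columns), copied from the table above
counts <- matrix(
  c(160,  0, 59, 29, 51, 47,  0, 31, 32,
      0, 46,  0, 11,  0,  0, 25,  0,  0,
      0, 46,  0,  5,  0,  0, 28,  0,  0,
     22,  1, 32, 37, 20, 14,  1, 10,  7),
  nrow = 4, byrow = TRUE,
  dimnames = list(mouse.id = c("3_11_M", "3_56_F", "3_57_F", "3_9_M"),
                  ident = as.character(0:8))
)

# Number of distinct mice contributing at least one cell to each cluster
mice_per_cluster <- colSums(counts > 0)

# Sex taken from the suffix of the mouse ID (assumed encoding)
sex <- sub(".*_", "", rownames(counts))

# Cells per sex and cluster, then the male fraction of each cluster
cells_by_sex  <- rowsum(counts, group = sex)
male_fraction <- cells_by_sex["M", ] / colSums(counts)
```

Every entry of `mice_per_cluster` is at least 2. `male_fraction` is 1 for clusters 0, 2, 4, 5, 7 and 8 and close to 0 for clusters 1 and 6; only cluster 3 is mixed.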
Color by cell ontology class on the original tSNE.
TSNEPlot(object = tiss, group.by = "cell_ontology_class")
filename = here('00_data_ingest', '04_tissue_robj_generated',
paste0("facs_", tissue_of_interest, "_seurat_tiss.Robj"))
print(filename)
[1] "/Users/josh/src/tabula-muris/00_data_ingest/04_tissue_robj_generated/facs_Liver_seurat_tiss.Robj"
save(tiss, file=filename)
# To reload a saved object
#filename = here('00_data_ingest', '04_tissue_robj_generated',
# paste0("facs_", tissue_of_interest, "_seurat_tiss.Robj"))
#load(file=filename)
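One point to keep in mind when reloading: `load()` restores objects under the names they were saved with, so it can silently overwrite a variable of the same name in the current session. The base-R sketch below shows a save/load round trip into a separate environment, using a small stand-in object and a temporary file; both are hypothetical and only serve the illustration.

```r
# Toy object standing in for the Seurat object (hypothetical)
toy <- list(tissue = "Liver", n_cells = 3)

# Save to a temporary file rather than the project directory
path <- tempfile(fileext = ".Robj")
save(toy, file = path)

# load() returns the names of the restored objects; loading into a fresh
# environment avoids overwriting a variable of the same name
env <- new.env()
loaded_names <- load(path, envir = env)
restored <- env[[loaded_names[1]]]
```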
save_annotation_csv(tiss, tissue_of_interest, "facs")